Auditory-based filter-bank analysis as a front-end processor for speech recognition

Authors

Hiroshi Hamada

Tatsuya Hirahara

Akihiro Imamura

Tatsuo Matsuoka

Ryohei Nakatsu

Abstract

Speech analysis based on human auditory processing is compared with conventional LPC analysis. The two types of parameters were compared on their ability to recognize fourteen consonants extracted from Japanese consonant-vowel (CV) syllables spoken in isolation. Three types of recognition algorithms were used: dynamic time warping with multiple template sets, hidden Markov models, and neural networks. The auditory system consisted of 35 channels spanning 100 to 5400 Hz, each of which consisted of a critical bandpass filtering process, a rectification process, an integration process, and a transformation into logarithmic form. A lateral inhibition process was also included in order to simulate human auditory processing more closely. The recognition experiments showed that parameters based on the features of human auditory processing are excellent for use in various types of speech recognition methods.
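The per-channel chain described in the abstract (bandpass filtering, rectification, integration, logarithm, then lateral inhibition across channels) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. The abstract gives only the channel count (35) and the frequency range (100 to 5400 Hz), so everything else here is an assumption: the 16 kHz sampling rate, Bark-spaced center frequencies (Traunmüller's approximation), bandwidths from the Zwicker-Terhardt critical-bandwidth approximation, second-order biquad filters, full-wave rectification, 10 ms frame averaging as the integrator, and a simple nearest-neighbor subtraction applied after the logarithm as the lateral inhibition. All function names are hypothetical.

```python
import math

FS = 16000                    # sampling rate in Hz (assumed; not in the abstract)
N_CHANNELS = 35               # from the abstract
F_LO, F_HI = 100.0, 5400.0    # from the abstract


def hz_to_bark(f):
    # Traunmueller's approximation of the Bark scale (assumed spacing rule).
    return 26.81 * f / (1960.0 + f) - 0.53


def bark_to_hz(z):
    # Algebraic inverse of hz_to_bark.
    return 1960.0 * (z + 0.53) / (26.28 - z)


def critical_bandwidth(f):
    # Zwicker-Terhardt approximation of the critical bandwidth in Hz (assumed).
    return 25.0 + 75.0 * (1.0 + 1.4 * (f / 1000.0) ** 2) ** 0.69


def center_frequencies(n=N_CHANNELS, lo=F_LO, hi=F_HI):
    # Channel centers equally spaced on the Bark scale between lo and hi.
    z_lo, z_hi = hz_to_bark(lo), hz_to_bark(hi)
    step = (z_hi - z_lo) / (n - 1)
    return [bark_to_hz(z_lo + k * step) for k in range(n)]


def bandpass(signal, f0, bw, fs=FS):
    # Second-order band-pass biquad with unity gain at f0 and Q = f0 / bw.
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) * bw / (2.0 * f0)
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    b0, b2 = alpha, -alpha            # b1 is zero for this filter type
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def channel_log_energy(signal, f0, frame_len, fs=FS):
    band = bandpass(signal, f0, critical_bandwidth(f0), fs)
    rectified = [abs(v) for v in band]                # full-wave rectification
    frames = []
    for start in range(0, len(rectified) - frame_len + 1, frame_len):
        mean = sum(rectified[start:start + frame_len]) / frame_len  # integration
        frames.append(math.log10(mean + 1e-10))       # logarithmic transform
    return frames


def lateral_inhibition(frame, strength=0.5):
    # Subtract a fraction of the mean of the two neighboring channels.
    # Edge channels reuse their own value for the missing neighbor.
    n = len(frame)
    out = []
    for k in range(n):
        left = frame[max(k - 1, 0)]
        right = frame[min(k + 1, n - 1)]
        out.append(frame[k] - strength * 0.5 * (left + right))
    return out


def auditory_features(signal, fs=FS, frame_ms=10.0, inhibit=True):
    # Returns a list of frames, each holding one value per channel.
    frame_len = int(fs * frame_ms / 1000.0)
    per_channel = [channel_log_energy(signal, f0, frame_len, fs)
                   for f0 in center_frequencies()]
    frames = [list(col) for col in zip(*per_channel)]
    if inhibit:
        frames = [lateral_inhibition(f) for f in frames]
    return frames


# Usage: a 100 ms pure tone placed at the center of channel 14.
centers = center_frequencies()
tone = [math.sin(2.0 * math.pi * centers[14] * n / FS) for n in range(1600)]
features = auditory_features(tone, inhibit=False)
inhibited = auditory_features(tone, inhibit=True)
```

Each frame in `features` is a 35-element vector that could be passed to any of the three recognizers named in the abstract. The inhibition step is kept separate so that its effect on the spectral contrast can be compared with and without it.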


Related resources

An Efferent-Inspired Auditory Model Front-End for Speech Recognition

In this paper, we investigate a closed-loop auditory model and explore its potential as a feature representation for speech recognition. The closed-loop representation consists of an auditory-based, efferent-inspired feedback mechanism that regulates the operating point of a filter bank, thus enabling it to dynamically adapt to changing background noise. With dynamic adaptation, the closed-loop...


Auditory Based Feature Vectors for Speech Recognition Systems

The signal processing front end that extracts the feature set is an important stage in any speech recognition system. The optimum feature set has still not been settled, despite extensive efforts by researchers. There are many types of features, which are derived differently and have a good impact on the recognition rate. This paper presents one more successful technique to extract the feature set from a...


Improving the filter bank of a classic speech feature extraction algorithm

The most popular speech feature extractor used in automatic speech recognition (ASR) systems today is the mel frequency cepstral coefficient (mfcc) algorithm. Introduced in 1980, the filter bank-based algorithm eventually replaced linear prediction cepstral coefficients (lpcc) as the premier front end, primarily because of mfcc’s superior robustness to additive noise. However, mfcc does not app...


Clean speech reconstruction from MFCC vectors and fundamental frequency using an integrated front-end

The aim of this work is to enable a noise-free time-domain speech signal to be reconstructed from a stream of MFCC vectors and fundamental frequency and voicing estimates, such as may be received in a distributed speech recognition system. To facilitate reconstruction, both a sinusoidal model and a source-filter model of speech are compared by listening tests and spectrogram analysis, with the ...


A Weighted Overlap Add-based Front-end for Speech Recognition

Speech signal enhancement is frequently referred to as a preprocessing step to speech recognition. However, in practice, this cannot be easily accomplished since the front-end signal processing techniques and/or parameters used in these two frequently differ. We apply a signal processing technique successfully used in speech enhancement to speech recognition and show that it can perform equally...



Publication date: 1989
